Privacy-Preserving Evaluation of Generalization Error and Its Application to Model and Attribute Selection

Authors

  • Jun Sakuma
  • Rebecca N. Wright
Abstract

Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. Existing studies in privacy-preserving classification have primarily emphasized the design of privacy-preserving versions of particular data mining algorithms. However, in classification problems, preprocessing and postprocessing, such as model selection or attribute selection, play a prominent role in achieving higher classification accuracy. In this paper, we show that the generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machines and attribute selection in naive Bayes classification.
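
The quantity being evaluated can be illustrated in the clear: on a held-out set, the number of misclassifications is the Hamming distance between the predicted and true label vectors, and model selection picks the candidate with the smallest such distance. The sketch below is only a plaintext illustration under that assumption. It is not the secure protocol of the paper, which the abstract does not describe, and the function names (`hamming_distance`, `error_rate`, `select_model`) are hypothetical.

```python
from typing import Dict, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length label vectors differ."""
    if len(a) != len(b):
        raise ValueError("label vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def error_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Empirical error: Hamming distance normalized by the sample count."""
    if len(actual) == 0:
        raise ValueError("need at least one labeled example")
    return hamming_distance(predicted, actual) / len(actual)


def select_model(candidates: Dict[str, Sequence[int]],
                 actual: Sequence[int]) -> str:
    """Return the name of the candidate whose predictions have lowest error.

    `candidates` maps a (hypothetical) model name to its predictions on
    the same held-out examples that `actual` labels.
    """
    return min(candidates, key=lambda name: error_rate(candidates[name], actual))


if __name__ == "__main__":
    truth = [1, 0, 1, 1]
    predictions = {
        "model_a": [1, 1, 1, 0],  # two mistakes
        "model_b": [1, 0, 1, 0],  # one mistake
    }
    for name, pred in predictions.items():
        print(name, error_rate(pred, truth))
    print("selected:", select_model(predictions, truth))
```

In the privacy-preserving setting the abstract describes, the predictions and the true labels would be held by different parties, and the point of the protocol is to obtain this distance without either vector being revealed; the plaintext version above only shows what is being computed.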

Similar resources

Application of Three Parameter Interval Grey Numbers in Enterprise Resource Planning Selection

This paper applies a new multi-attribute decision-making (MADM) model to help companies with the enterprise resource planning (ERP) selection problem, based on the Balanced Scorecard method. It uses three-parameter interval grey numbers, which are derived from Grey theory (proposed by J. Deng). These numbers are used instead of linguistic variables. Besides, a new weighting method that outcomes f...

A Novel Anonymity Algorithm for Privacy Preserving in Publishing Multiple Sensitive Attributes

Publishing data with multiple sensitive attributes poses a greater privacy-preservation challenge than publishing data with a single sensitive attribute. In this study, we propose a novel privacy-preserving model based on k-anonymity, called (α, β, k)-anonymity, for databases. (α, β, k)-anonymity can be used to protect data with multiple sensitive attributes in data publishing...

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about publishing social network graph data have increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...

Attribute-based Access Control for Cloud-based Electronic Health Record (EHR) Systems

Electronic health record (EHR) systems facilitate the integration of patients' medical information and improve service productivity. However, giving users access to patient data in a privacy-preserving manner is still a challenging problem. Many studies are concerned with security and privacy in EHR systems. Rezaeibagha and Mu [1] have proposed a hybrid architecture for privacy-preserving accessing patient records...

Publication date: 2009